Quote Clustering in Online News
Abstract
The notion that information moves through social networks has been widely discussed [3]; however, the ability to model this phenomenon quantitatively is new, made possible by the growing availability of large digital corpora. To this end, we explore a large corpus of online news quotations, looking for cases of noisy reproduction and the factors that influence such noise. An essential step in this process is distinguishing mutational variants from entirely independent but similar quotations. Because it cannot be known with complete certainty whether two quotes really were derived from the same original utterance, it is not immediately apparent how to make progress on such a task. Our approach is twofold. On the one hand, we attempt to frame the problem in terms of supervised learning, annotating data using Mechanical Turk. On the other hand, we approach the problem from the perspective of unsupervised clustering, projecting the data into a variety of metrics, which allows us to test and extend our linguistic intuitions about the dataset.
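To make the unsupervised side of the problem concrete, here is a minimal sketch of grouping near-duplicate quotations. It is not the authors' method: the abstract does not name its metrics, so token-level Jaccard similarity, the 0.6 threshold, single-link grouping, and the function names `jaccard` and `cluster_quotes` are all assumptions chosen for illustration.

```python
import re
from itertools import combinations


def tokens(quote):
    """Lowercase a quote and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", quote.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two quotes."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def cluster_quotes(quotes, threshold=0.6):
    """Single-link grouping: quotes whose similarity meets the
    (assumed) threshold end up in the same cluster."""
    parent = list(range(len(quotes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(quotes)), 2):
        if jaccard(quotes[i], quotes[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i, quote in enumerate(quotes):
        groups.setdefault(find(i), []).append(quote)
    return list(groups.values())


# Hypothetical quotes, invented for illustration only.
sample = [
    "we will not raise taxes on the middle class",
    "we will not raise taxes on middle class families",
    "the weather will be sunny tomorrow",
]
clusters = cluster_quotes(sample)
```

With these invented inputs, the first two quotes share 8 of 10 distinct tokens and fall into one cluster, while the third stays on its own. A surface metric like this cannot by itself tell a mutated variant from an independent but similar quotation, which is the distinction the abstract identifies as the hard part.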
Similar resources
A Comparative Review of Hijab Discovery News Coverage in News Media
Purpose: News media play an important role in attitude towards various issues including hijab and hijab discovery. As a result, the purpose of this research was comparative review of hijab discovery news coverage in news media. Methodology: This study in terms of purpose was applied and in terms of implementation method was quantitative. The research population was the hijab discovery news in ...
PENETRATE: Personalized news recommendation using ensemble hierarchical clustering
Recommending online news articles has become a promising research direction as the Internet provides fast access to real-time information from multiple sources around the world. Many online readers have their own reading preference on news articles; however, a group of users might be interested in similar fascinating topics. It would be helpful to take into consideration the individual and grou...
The Representation of Social Actors in the Graduate Employability Issue: Online News and the Government Document
This paper presents the first part of a larger study on the issue of graduate employability in Malaysia as construed in public discourse in English, a language of power in Malaysia. The term employability itself has many definitions depending on the requirements of government and industry, and in the case of Malaysia, the English-language ability of graduates is inseparable from graduate employ...
Examining the Associations of Covid-19 Vaccine News Sources with the Intention of Changing Adherence to Covid-19 Preventive Health Measures: A Online-Based Study in the North of Iran
Background: Although the scientific literature has extensively discussed the impact of the media on people’s health-related behaviors, there is little evidence on the effect of different sources of Covid-19 vaccine news on changing the intention to adhere to health protocols. Therefore, the present study was conducted to investigate the news sources of Covid vaccine 19 and the association of ea...
Clustering-Based Searching and Navigation in an Online News Source
The growing amount of online news posted on the WWW demands new algorithms that support topic detection, search, and navigation of news documents. This work presents an algorithm for topic detection that considers the temporal evolution of news and the structure of web documents. Then, it uses the results of the topic detection algorithm for searching and navigating in an online news source. An...
Publication date: 2010